
Conversation

@theStack
Contributor

@theStack theStack commented Oct 31, 2025

Description

This PR implements BIP352, with scanning limited to full nodes. Light-client scanning is planned to be added in a separate PR in the future. The following five API functions are currently introduced:

Sender side [BIP description]:

  • secp256k1_silentpayments_sender_create_outputs: given a list of $n$ secret keys $a_1 ... a_n$, a serialized outpoint, and a list of recipients (each consisting of silent payments scan pubkey and spend pubkey), create the corresponding transaction outputs (x-only public keys) for the sending transaction

Receiver side, label creation [BIP description]:

  • secp256k1_silentpayments_recipient_create_label: given a scan secret key and a label integer, calculate the corresponding label_tweak and label public key
  • secp256k1_silentpayments_recipient_create_labeled_spend_pubkey: given a spend public key and a label public key, create the corresponding labeled spend public key

Receiver side, scanning [BIP description]:

  • secp256k1_silentpayments_recipient_prevouts_summary_create: given a list of $n$ public keys $A_1 ... A_n$ and a serialized outpoint, create a prevouts_summary object needed for scanning
  • secp256k1_silentpayments_recipient_scan_outputs: given a prevouts_summary object, a recipient's scan secret key and spend public key, and the relevant transaction outputs (x-only public keys), scan for outputs belonging to the recipient and return the tweak(s) needed for spending the output(s). Optionally, a label_lookup callback function can be provided to also scan for labels.

For a higher-level overview of what exactly these functions do, it's suggested to look at a corresponding Python implementation that was created based on the secp256k1lab project (it passes the test vectors, so this "executable pseudo-code" should be correct).

Changes to the previous take

Based on the latest state of the previous PR #1698 (take 3), the following changes have been made:

The scope reduction isn't immediately visible in the commit count (only one commit exclusively introduced light-client-relevant functionality and could be removed entirely), but the review burden compared to #1698 is still significantly lower in terms of LOC, especially in the receiving commit.

Open questions / TODOs

@w0xlt

w0xlt commented Nov 6, 2025

Added the optimized version on top of this PR:
w0xlt@8d16914

For more context:
#1698 (comment)

@theStack
Contributor Author

theStack commented Nov 7, 2025

Small supplementary update: I've created a corresponding Python implementation of the provided API functions based on secp256k1lab (https://github.com/theStack/secp256k1lab/blob/add_bip352_module_review_helper/src/secp256k1lab/bip352.py) (also linked in the PR description). The hope is that this makes reviewing this PR a bit easier by providing a less noisy, "executable pseudo-code"-like description of what happens under the hood. The code passes the BIP352 test vectors and hence should be correct.

Added the optimized version on top of this PR: w0xlt@8d16914

For more context: #1698 (comment)

Thanks for rebasing on top of this PR, much appreciated! I will take a closer look within the next days.

@w0xlt w0xlt left a comment

Nit: Not related to optimization, but the diff below removes some redundant public-key serialization code:

diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 106da20..922433d 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -21,6 +21,19 @@
 /** magic bytes for ensuring prevouts_summary objects were initialized correctly. */
 static const unsigned char secp256k1_silentpayments_prevouts_summary_magic[4] = { 0xa7, 0x1c, 0xd3, 0x5e };
 
+/* Serialize a ge to compressed 33 bytes. Keeps eckey_pubkey_serialize usage uniform
+ * (expects non-const ge*), and centralizes the VERIFY_CHECK. */
+static SECP256K1_INLINE void secp256k1_sp_ge_serialize33(const secp256k1_ge* in, unsigned char out33[33]) {
+    size_t len = 33;
+    secp256k1_ge tmp = *in;
+    int ok = secp256k1_eckey_pubkey_serialize(&tmp, out33, &len, 1);
+#ifdef VERIFY
+    VERIFY_CHECK(ok && len == 33);
+#else
+    (void)ok;
+#endif
+}
+
 /** Sort an array of silent payment recipients. This is used to group recipients by scan pubkey to
  *  ensure the correct values of k are used when creating multiple outputs for a recipient.
  *
@@ -68,13 +81,11 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
     secp256k1_sha256 hash;
     unsigned char pubkey_sum_ser[33];
     unsigned char input_hash[32];
-    size_t len;
     int ret, overflow;
 
     secp256k1_silentpayments_sha256_init_inputs(&hash);
     secp256k1_sha256_write(&hash, outpoint_smallest36, 36);
-    ret = secp256k1_eckey_pubkey_serialize(pubkey_sum, pubkey_sum_ser, &len, 1);
-    VERIFY_CHECK(ret && len == sizeof(pubkey_sum_ser));
+    secp256k1_sp_ge_serialize33(pubkey_sum, pubkey_sum_ser);
     secp256k1_sha256_write(&hash, pubkey_sum_ser, sizeof(pubkey_sum_ser));
     secp256k1_sha256_finalize(&hash, input_hash);
     /* Convert input_hash to a scalar.
@@ -85,15 +96,13 @@ static int secp256k1_silentpayments_calculate_input_hash_scalar(secp256k1_scalar
      * an error to ensure strict compliance with BIP0352.
      */
     secp256k1_scalar_set_b32(input_hash_scalar, input_hash, &overflow);
-    ret &= !secp256k1_scalar_is_zero(input_hash_scalar);
+    ret = !secp256k1_scalar_is_zero(input_hash_scalar);
     return ret & !overflow;
 }
 
 static void secp256k1_silentpayments_create_shared_secret(const secp256k1_context *ctx, unsigned char *shared_secret33, const secp256k1_ge *public_component, const secp256k1_scalar *secret_component) {
     secp256k1_gej ss_j;
     secp256k1_ge ss;
-    size_t len;
-    int ret;
 
     secp256k1_ecmult_const(&ss_j, public_component, secret_component);
     secp256k1_ge_set_gej(&ss, &ss_j);
@@ -103,12 +112,7 @@ static void secp256k1_silentpayments_create_shared_secret(const secp256k1_contex
      * impossible at this point considering we have already validated the public key and
      * the secret key.
      */
-    ret = secp256k1_eckey_pubkey_serialize(&ss, shared_secret33, &len, 1);
-#ifdef VERIFY
-    VERIFY_CHECK(ret && len == 33);
-#else
-    (void)ret;
-#endif
+    secp256k1_sp_ge_serialize33(&ss, shared_secret33);
 
     /* Leaking these values would break indistinguishability of the transaction, so clear them. */
     secp256k1_ge_clear(&ss);
@@ -585,7 +589,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_ge output_negated_ge, tx_output_ge;
                 secp256k1_gej tx_output_gej, label_gej;
                 unsigned char label33[33];
-                size_t len;
 
                 secp256k1_xonly_pubkey_load(ctx, &tx_output_ge, tx_outputs[j]);
                 secp256k1_gej_set_ge(&tx_output_gej, &tx_output_ge);
@@ -595,7 +598,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_ge_neg(&output_negated_ge, &output_ge);
                 secp256k1_gej_add_ge_var(&label_gej, &tx_output_gej, &output_negated_ge, NULL);
                 secp256k1_ge_set_gej_var(&label_ge, &label_gej);
-                ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
                 /* Serialize must succeed because the point was just loaded.
                  *
                  * Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -603,7 +605,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                  * Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
                  * point at infinity.
                  */
-                VERIFY_CHECK(ret && len == 33);
+                secp256k1_sp_ge_serialize33(&label_ge, label33);
                 label_tweak = label_lookup(label33, label_context);
                 if (label_tweak != NULL) {
                     found = 1;
@@ -617,7 +619,6 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                 secp256k1_gej_neg(&label_gej, &tx_output_gej);
                 secp256k1_gej_add_ge_var(&label_gej, &label_gej, &output_negated_ge, NULL);
                 secp256k1_ge_set_gej_var(&label_ge, &label_gej);
-                ret = secp256k1_eckey_pubkey_serialize(&label_ge, label33, &len, 1);
                 /* Serialize must succeed because the point was just loaded.
                  *
                  * Note: serialize will also fail if label_ge is the point at infinity, but we know
@@ -625,7 +626,7 @@ int secp256k1_silentpayments_recipient_scan_outputs(
                  * Thus, we know that label_ge = tx_output_gej + output_negated_ge cannot be the
                  * point at infinity.
                  */
-                VERIFY_CHECK(ret && len == 33);
+                secp256k1_sp_ge_serialize33(&label_ge, label33);
                 label_tweak = label_lookup(label33, label_context);
                 if (label_tweak != NULL) {
                     found = 1;

@w0xlt w0xlt left a comment

nit: The following diff removes the implicit cast and clarifies that k is 4 bytes

diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
index 922433d..d94aed6 100644
--- a/src/modules/silentpayments/main_impl.h
+++ b/src/modules/silentpayments/main_impl.h
@@ -512,7 +512,8 @@ int secp256k1_silentpayments_recipient_scan_outputs(
     secp256k1_xonly_pubkey output_xonly;
     unsigned char shared_secret[33];
     const unsigned char *label_tweak = NULL;
-    size_t j, k, found_idx;
+    size_t j, found_idx;
+    uint32_t k;
     int found, combined, valid_scan_key, ret;
 
     /* Sanity check inputs */

Contributor

@jonasnick jonasnick left a comment

Thanks @theStack for the new PR. I can confirm that this PR is a rebased version of #1698, with the light client functionality removed and comments addressed, except for:

@jonasnick
Contributor

Not providing prevouts_summary (de)serialization functionality yet in the API poses the risk that users try to do it anyway by treating the opaque object as "serialized". How to cope with that? Is adding a "don't do this" comment in API header sufficient?

Is there a reason for serializing prevouts_summary without light client functionality? If not, I think the don't do this comment is sufficient. Right now, in contrast to the docs of all other opaque objects, this is missing, however:

The exact representation of data inside the opaque data structures is implementation defined and not guaranteed to be portable between different platforms or versions.

@theStack
Contributor Author

@w0xlt, @jonasnick: Thanks for the reviews! I've addressed the suggested changes:

  • in _recipient_scan_outputs: changed the type of k to uint32_t (comment above)
  • in _recipient_create_label: added a scan key validity check (+added a test for that) (#1698 - comment)
  • unified all mentions of "Silent Payments" to title case in the header API and example (#1698 - comment)
  • fixed typo s/elemement/element/ (#1698 - review)
  • in _recipient_scan_outputs: fixed comment in second label candidate (review above)
  • extended the API header comment for the _prevouts_summary opaque data structure, to point out that the data structure is implementation defined (like docs of all other opaque structs) (comment above)

Nit: Not related to optimization, but the diff below removes some redundant public-key serialization code:

Given that this compressed-pubkey-serialization pattern shows up repeatedly also in other modules (ellswift, musig), I think it would make the most sense to add a general helper (e.g. in eckey{,_impl}.h), which could be done in an independent PR. I've opened issue #1773 to see if there is conceptual support for doing this.

Not providing prevouts_summary (de)serialization functionality yet in the API poses the risk that users try to do it anyway by treating the opaque object as "serialized". How to cope with that? Is adding a "don't do this" comment in API header sufficient?

Is there a reason for serializing prevouts_summary without light client functionality? If not, I think the don't do this comment is sufficient.

Good point, I can't think of a good reason for full nodes wanting to serialize prevouts_summary.

@nymius

nymius commented Nov 20, 2025

To address the open questions, I've reviewed the changes proposed by @w0xlt in 8d16914.

I'm going to focus more on the key aspects I extracted from the review and the merits of each change, rather than on the big-O improvement claims, because I didn't get that far.

These are multiple different changes rather than a single one, so to make the review easier I suggest breaking it into multiple commits. I would state, for each of them, the purpose and the real-world scenario where the change would be relevant.

Also, I would use clearer names for the variables, or at least document their purpose.

The changes I've identified so far are the following:

  • Improve label lookup using a hash table: I think this is implementation-dependent and should be improved by the user rather than by the library itself.
    Examples, as part of the documentation, are a usage demonstration; although they can point the user to best practices, I prefer clarity over performance in them. For example, in the rust-secp256k1 bindings, I used a HashMap for the example because it is a familiar standard structure for Rust users. If users would like more performance there, they have other tools available to replace that structure themselves.
  • On secp256k1_silentpayments_recipient_sort_cmp, I understood the change (r1->index < r2->index) ? -1 : (r1->index > r2->index) ? 1 : 0 is meant to make the unstable secp256k1_hsort implementation stable. Considering it only affects secp256k1_silentpayments_sender_create_outputs, which didn't receive more changes than a variable removal, what are the performance improvements there?
  • SECP256K1_SP_SCAN_BATCH: I just learned about ge_set_gej_all during this review, but it seems the Jacobian-to-affine conversion is needed for the comparison against labels, and it is faster to do in batches. I think this improvement will only affect the small subset of transactions with a large number of outputs. A good benchmark for this would be coinjoin transactions; although those are not supported by BIP 352, a test case is doable.
  • Double head: I'm not sure of the target of this change; I guess it is to skip already found or scanned inputs, but I couldn't figure out what each head is tracking.
  • Binary tree search for x-only lookups: I think it explains itself: faster lookups in the unlabeled case. This may have its merits, but I need to remove the other changes to have a clear answer.

In general I agree with @jonasnick that we should define a clear target to benchmark and improve. As I've said before, the base case should be a wallet with a single label for change.
For other improvements, I would try to match them up with plausible real-world scenarios before making complex changes to the code base of the PR.
Finally, looking at bench_silentpayments_full_tx_scan, the use_labels case is very simple. If we want to test performance improvements on the label lookup, I would start there.

In conclusion, from the proposed commit and the discussion around it, the only change I've found clear enough to consider is:

  • Tracking of found outputs: simple enough, small performance improvement for the usual case, better for larger transactions.

@w0xlt

w0xlt commented Nov 20, 2025

Thanks @nymius for reviewing the changes, addressing the main points, and proposing a simplification.

I’m currently splitting the optimization commit into smaller pieces to make it easier to review.
I’ll also take a closer look at your commit and run it against the benchmark files.

The only part of the discussion that still feels a bit ambiguous is the “base” or “usual” case.
From my understanding of #1698 (review), the concern is not about typical usage, but rather about an attacker crafting malicious transactions with many outputs, causing the scanning process to take hours.

So the goal of this optimization would be to mitigate that scenario, not the collaborative one.

@w0xlt

w0xlt commented Nov 21, 2025

I ran the examples/silentpayments_mixed_1.c file with the simplified version suggested by @nymius (jonasnick@311b4eb). It shows slightly worse performance (see below), but the simpler approach may still be worth it.

Without the secp256k1_silentpayments_recipient_sort_cmp stabilization, I got 82s for the complex version vs. 114s for the simpler one. Whether the 30s difference justifies the additional complexity is up for discussion — I don’t have a strong opinion.

Answering the questions: the secp256k1_silentpayments_recipient_sort_cmp stabilization and the heads speed up the non-adversarial case. The stable sort ensures that the transaction outputs are ordered sequentially by the index $k$.

The optimized receiver implementation relies on a heuristic (the head pointers) that assumes the next output it is looking for ($k+1$) is located immediately after the previous one ($k$).

  • With Stable Sort: The scanner complexity is roughly O(N) (Linear).
  • Without Stable Sort: The scanner complexity degrades to O(N²) (Quadratic).

@theStack
Contributor Author

@w0xlt, @nymius: Thanks for investigating this deeper. I've now also had a chance to look at the suggested optimizations and came to similar conclusions as stated in #1765 (comment). I particularly agree with the stated points that the changes should not increase complexity significantly and that the most important optimization candidate to consider for mitigating the worst-case scanning attack is "skip outputs that we have already found" (as previously stated by @jonasnick, see #1698 (comment) and jonasnick@311b4eb). I don't think stabilizing the sorting helps at all, since this is something that happens at the sender side, and we can't rely on the attacker using a specific implementation (even if they did, it's trivial for them to shuffle the outputs after creation).

For the proposed target to benchmark, I propose the following modified example that exhibits the worst-case scanning time, based on a labels cache with one entry (for change outputs), by creating a tx with 23255 outputs [1] all targeted at Bob: 1df4287

Shower-thought from this morning: what if we treat the tx_outputs input as an actual list that we modify, and remove an entry once it is found? This would have a similar effect as the "track found outputs" idea, but without the need for dynamic memory allocation. It's a ~10-line diff and seems to work fine: theStack@9aba470
It reduces the run-time of the proposed example above from ~10 minutes to roughly ~2 seconds on my machine.

Any thoughts on this? Maybe I'm still missing something.

[1] that's an upper bound on the maximum number of outputs per block: floor(1000000/43) = 23255

@nymius

nymius commented Nov 21, 2025

Shower-thought from this morning: what if we treat the tx_outputs input as an actual list that we modify, and remove an entry once it is found? This would have a similar effect as the "track found outputs" idea, but without the need for dynamic memory allocation. It's a ~10-line diff and seems to work fine: theStack@9aba470 It reduces the run-time of the proposed example above from ~10 minutes to roughly ~2 seconds on my machine.

Any thoughts on this? Maybe I'm still missing something.

1df4287 is a good target.
These are the runtimes I've obtained testing against this target:

Branch                                  Runtime
theStack/secp256k1@9aba470              ~1.41s
theStack/secp256k1@9103229 (baseline)   ~9m
jonasnick/secp256k1@311b4eb             ~1.49s

I had to increase the stack size to be able to fit all the N_OUTPUTS-sized allocations in the example.

Initially I preferred the is_found allocation rather than the element shifts, but your solution seems to be more performant.

@w0xlt

w0xlt commented Nov 22, 2025

@theStack Yes — if we want to keep only the adversarial-scenario optimizations, we can drop sort stabilization and the extra heads.

I like your idea of avoiding dynamic memory allocation. That’s a very interesting direction. On my machine, the scan completes in about 0.4s, which feels like a good balance between simplicity and the optimization needed for the labeled case.

Below are the changes I had to make for your example to run on my machine and to record the scan time.

diff --git a/examples/silentpayments.c b/examples/silentpayments.c
index 5e71e73..d43332f 100644
--- a/examples/silentpayments.c
+++ b/examples/silentpayments.c
@@ -10,6 +10,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
 
 #include <secp256k1_extrakeys.h>
 #include <secp256k1_silentpayments.h>
@@ -112,15 +113,21 @@ const unsigned char* label_lookup(
     return NULL;
 }
 
+static secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
+static const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
+static secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
+static secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
+static secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
+static secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
+static const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
+/* 2D array for holding multiple public key pairs. The second index, i.e., [2],
+ * is to represent the spend and scan public keys. */
+static unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
+
 int main(void) {
     unsigned char randomize[32];
     unsigned char serialized_xonly[32];
-    secp256k1_xonly_pubkey tx_inputs[N_INPUTS];
-    const secp256k1_xonly_pubkey *tx_input_ptrs[N_INPUTS];
-    secp256k1_xonly_pubkey tx_outputs[N_OUTPUTS];
-    secp256k1_xonly_pubkey *tx_output_ptrs[N_OUTPUTS];
-    secp256k1_silentpayments_found_output found_outputs[N_OUTPUTS];
-    secp256k1_silentpayments_found_output *found_output_ptrs[N_OUTPUTS];
     secp256k1_silentpayments_prevouts_summary prevouts_summary;
     secp256k1_pubkey unlabeled_spend_pubkey;
     struct labels_cache bob_labels_cache;
@@ -209,11 +216,6 @@ int main(void) {
     {
         secp256k1_keypair sender_keypairs[N_INPUTS];
         const secp256k1_keypair *sender_keypair_ptrs[N_INPUTS];
-        secp256k1_silentpayments_recipient recipients[N_OUTPUTS];
-        const secp256k1_silentpayments_recipient *recipient_ptrs[N_OUTPUTS];
-        /* 2D array for holding multiple public key pairs. The second index, i.e., [2],
-         * is to represent the spend and scan public keys. */
-        unsigned char (*sp_addresses[N_OUTPUTS])[2][33];
         unsigned char seckey[32];
 
         printf("Sending...\n");
@@ -340,6 +342,9 @@ int main(void) {
              *        `secp256k1_silentpayments_recipient_prevouts_summary_create`
              *     2. Call `secp256k1_silentpayments_recipient_scan_outputs`
              */
+            clock_t start, end;
+            double cpu_time_used;
+
             ret = secp256k1_silentpayments_recipient_prevouts_summary_create(ctx,
                 &prevouts_summary,
                 smallest_outpoint,
@@ -356,14 +361,20 @@ int main(void) {
 
             /* Scan the transaction */
             n_found_outputs = 0;
+            
+            start = clock();
             ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
                 found_output_ptrs, &n_found_outputs,
-                (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, N_OUTPUTS,
+                (const secp256k1_xonly_pubkey **)tx_output_ptrs, N_OUTPUTS,
                 bob_scan_key,
                 &prevouts_summary,
                 &unlabeled_spend_pubkey,
                 label_lookup, &bob_labels_cache /* NULL, NULL for no labels */
             );
+            end = clock();
+            cpu_time_used = ((double) (end - start)) / CLOCKS_PER_SEC;
+            printf("Bob's scan took %f seconds\n", cpu_time_used);
+            
             if (!ret) {
                 printf("This transaction is not valid for Silent Payments, skipping.\n");
                 return EXIT_SUCCESS;
@@ -435,7 +446,7 @@ int main(void) {
             n_found_outputs = 0;
             ret = secp256k1_silentpayments_recipient_scan_outputs(ctx,
                 found_output_ptrs, &n_found_outputs,
-                (const secp256k1_xonly_pubkey * const *)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
+                (const secp256k1_xonly_pubkey **)tx_output_ptrs, 1, /* dummy scan with one output (we only care about Bob) */
                 carol_scan_key,
                 &prevouts_summary,
                 &unlabeled_spend_pubkey,

@theStack
Contributor Author

@nymius, @w0xlt: Thanks once again for the quick feedback and for benchmarking! Shortly after my previous comment, I was notified about yet another approach to tackle the worst-case scanning time attack (kudos to @furszy for bringing up the idea!) that I think is even more elegant: we can use the pointers in the tx_outputs list directly to track outputs, by setting them to NULL once one has been found and only treating the non-NULL ones afterwards. With this, it's a change of only four lines of code: theStack@2087f92. It combines the previous two approaches of jonasnick@311b4eb and theStack@9aba470 (mark spent outputs, but not in a newly allocated array; instead, modify the tx_outputs input list to avoid dynamic memory allocation), with very similar run-time results.

The only tiny drawback of these non-malloc approaches might be that something that is conceptually an "in" parameter is modified, which might be a bit unsound in a strict API-design sense. On the other hand, it shouldn't matter for the user (I doubt that the lists passed in would ever be reused for anything else by the callers afterwards), and we already do the same in the sending API for the recipients, so it's probably fine.

@theStack Yes — if we want to keep only the adversarial-scenario optimizations, we can drop sort stabilization and the extra heads.

The way I see it currently, code paths for non-adversarial scenarios with increasing k values would be hit so rarely in practice that I'm sceptical it's worth putting much effort into those optimizations. When scanning, the vast majority of transactions won't have any matches in the first place. Of the few that do have a match, the vast majority will very likely not contain any repeated recipient either (IMHO it doesn't make much sense to do that, unless the recipient explicitly asks "I want to receive my payment split up into multiple UTXOs, but still in a single tx"?), so in the bigger picture those optimizations wouldn't matter all that much, and I'd assume that the dominant factor is by far the (unavoidable) ECDH computations per transaction. But that's still more of a guess, and it's good to already have optimization ideas at hand if we need them in the future.

@w0xlt

w0xlt commented Nov 23, 2025

@theStack Thanks for continuing to refine the optimization. The deletion approach performs slightly better (0.40 s vs. 0.45 s), likely because deleting items shrinks the array and cuts the number of loop iterations by about 50% compared to nullifying them.

theStack added a commit to theStack/secp256k1 that referenced this pull request Nov 25, 2025
assuming the labels cache has only one entry (for change) for now

includes fixes by w0xlt in order to avoid running into a stack overflow
and time measuring code, see
bitcoin-core#1765 (comment)
@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from 9103229 to 650b2fb Compare November 25, 2025 19:07
@theStack
Contributor Author

theStack commented Nov 25, 2025

To summarize, the following table shows the proposed mitigations for the worst-case scanning attack so far, with benchmark results from my machine. The previous baseline commit with the worst-case example has been updated to include @w0xlt's changes, in order to work without stack size limit changes.
(EDIT: These benchmark results are based on a baseline that doesn't represent the worst-case and are thus not representative, as noticed by w0xlt below.)

Branch                               Approach                                                   Runtime
theStack@c16252d (Branch baseline)   modified example to exercise worst-case scanning, no fix   641.391838s
theStack@ec27977 (Branch fix1_...)   mark found outputs in calloced array                       0.543969s
theStack@135ca0a (Branch fix2_...)   remove matched outputs by shifting remaining entries       0.514952s
theStack@8360150 (Branch fix3_...)   mark found outputs by NULL in tx_outputs input array       0.544740s

The run-times of the fixes vary slightly (the removal approach "fix2" being the fastest, confirming #1765 (comment) above), but they are all in the same ballpark. I don't think exact performance results matter much here, as the goal of the mitigation should IMHO be to roughly cut the run-time down from "minutes" to "seconds" (and remember, this is already the absolute worst case: one giant non-standard transaction filling a whole block, and it can only slow down one specific SP recipient). Thus, I decided to pick the simplest approach that avoids dynamic memory allocation, i.e. fix number 3, using NULL as a marker in tx_outputs.

With that tackled, I believe that all of the open questions and TODOs are addressed now (updated the PR description accordingly). The latest force-push also includes a rebase on master (to include the CI fix #1771).

theStack and others added 11 commits November 27, 2025 18:36
Add a routine for the entire sending flow which takes a set of private keys,
the smallest outpoint, and list of recipients and returns a list of
x-only public keys by performing the following steps:

1. Sum up the private keys
2. Calculate the input_hash
3. For each recipient group:
    3a. Calculate a shared secret
    3b. Create the requested number of outputs

This function assumes a single sender context in that it requires the
sender to have access to all of the private keys. In the future, this
API may be expanded to allow for multiple senders or for a single
sender who does not have access to all private keys at any given time,
but for now these modes are considered out of scope / unsafe.

Internal to the library, add:

1. A function for creating shared secrets (i.e., a*B or b*A)
2. A function for generating the "SharedSecret" tagged hash
3. A function for creating a single output public key
Add function for creating a label tweak. This requires a tagged hash
function for labels. This function is used by the receiver for creating
labels to be used for a) creating labeled addresses and b) to populate
a labels cache when scanning.

Add function for creating a labeled spend pubkey. This involves taking
a label tweak, turning it into a public key and adding it to the spend
public key. This function is used by the receiver to create a labeled
silent payment address.

Add tests for the label API.
Add routine for scanning a transaction and returning the necessary
spending data for any found outputs. This function works with labels via
a lookup callback and requires access to the transaction outputs.
Requiring access to the transaction outputs is not suitable for light
clients, but light client support is enabled in the next commit.

Add an opaque data type for passing around the prevout public key sum
and the input hash tweak (input_hash). This data is passed to the scanner
before the ECDH step as two separate elements so that the scanner can
multiply the scan_key * input_hash before doing ECDH.

Finally, add test coverage for the receiving API.
Demonstrate sending and scanning on full nodes.
Add a benchmark for a full transaction scan.
Only benchmarks for scanning are added as this is the most
performance critical portion of the protocol.

Co-authored-by: Sebastian Falbesoner <[email protected]>
Add the BIP-352 test vectors. The vectors are generated with a Python script
that converts the .json file from the BIP to C code:

$ ./tools/tests_silentpayments_generate.py test_vectors.json > ./src/modules/silentpayments/vectors.h

Co-authored-by: Ron <[email protected]>
Co-authored-by: Sebastian Falbesoner <[email protected]>
Co-authored-by: Tim Ruffing <[email protected]>
Co-authored-by: Jonas Nick <[email protected]>
Co-authored-by: Sebastian Falbesoner <[email protected]>
Test midstate tags used in silent payments.
@theStack theStack force-pushed the silentpayments_module_fullnode_only branch from 650b2fb to f96f41f Compare November 27, 2025 23:45
@theStack
Contributor Author

I assume you want to rebase on master now that #1774 has been merged.

Yes, done. Using the new secp256k1_eckey_pubkey_serialize33 function, the sending and receiving commit diffs got a bit smaller (as outlined in #1765 (review)), -13 LOC in total.

@nymius

nymius commented Nov 29, 2025

Thanks for the summary, I've checked the results. I agree with the fix3_* branch as the final solution: it is elegant and the trade-offs are clear.

@w0xlt

w0xlt commented Nov 30, 2025

Hi @theStack . That's not actually the worst case. The worst case is when the attacker shuffles the outputs randomly (or in any adversarial order). I modified your benchmark to randomly shuffle outputs, and scanning takes several minutes to hours with the current optimization proposal.

w0xlt@435cb96

The issue is that the scanning function receives a label lookup callback, which makes it harder to build a sorted index upfront. Instead, for each output tweak index k, it linearly scans all transaction outputs — resulting in O(n²) complexity. An attacker creating ~23,000 unordered outputs can make scanning take hours.

If we slightly change the API to replace the label lookup callback with a label entry set, we can sort once and search fast:

  1. Build a sorted index of outputs by serialized x-only pubkey — O(n log n) upfront
  2. Binary search for each candidate — O(log n) per lookup


That way, scanning 23,255 adversarially-ordered outputs drops from hours to ~0.3s. The optimization is purely receiver-side, requires no protocol changes, and works regardless of output ordering. This reduces architectural flexibility, though.

Before: for each k, scan all n outputs → O(k × n) ≈ O(n²)
After: sort outputs, binary search → O(n log n + k log n)
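The two steps can be sketched in Python, with byte strings standing in for serialized x-only pubkeys (helper names are made up for illustration):

```python
import bisect

def build_output_index(tx_outputs):
    """O(n log n) upfront: sort serialized outputs once, remembering
    each output's original position in the transaction."""
    index = sorted((out, pos) for pos, out in enumerate(tx_outputs))
    keys = [out for out, _ in index]
    return keys, index

def find_output(keys, index, candidate):
    """O(log n) per lookup: binary-search a candidate output pubkey,
    returning its original transaction position, or None if absent."""
    i = bisect.bisect_left(keys, candidate)
    if i < len(keys) and keys[i] == candidate:
        return index[i][1]
    return None
```

Because each candidate lookup is logarithmic, the total cost stays near-linear even when an attacker shuffles the outputs adversarially.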



The following commit implements this proposal:


w0xlt@2305d73

You can verify by running ./build/bin/silentpayments_example (ordered) and ./build/bin/silentpayments_shuffled_example (shuffled).

@theStack
Contributor Author

theStack commented Dec 1, 2025

Hi @theStack . That's not actually the worst case. The worst case is when the attacker shuffles the outputs randomly (or in any adversarial order).

@w0xlt: Good catch, I indeed missed involving the output order for the worst-case scenario, so theStack@c16252d is not a useful target (in hindsight, the benchmark results look too good to be true, and my previous words of "we can't rely on the attacker using a specific implementation" strike back very hard). Thanks for the updated benchmark and the new worst-case fix proposal!

I modified your benchmark to randomly shuffle outputs, and scanning takes several minutes to hours with the current optimization proposal.

Are you sure it could take up to hours? That would be significantly worse than the previous baseline commit (~10 minutes on my machine), which seems implausible to me. In the unoptimized scanning implementation, the run-time should be largely independent of the order of outputs, as the total number of inner loop iterations is constant ($1+2+3+...+N = (N*(N+1))/2 = 270409140$ for our worst-case scenario of $N = 23255$), and it'd be very surprising if any of the three proposed optimizations performed worse than the baseline for shuffled outputs.
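The iteration count stated above can be double-checked with a few lines of Python:

```python
N = 23255  # worst-case number of outputs (one giant non-standard transaction)
# In the unoptimized scan, the k-th pass revisits the remaining outputs,
# so the inner loop runs 1 + 2 + ... + N times in total, regardless of order.
total_iterations = N * (N + 1) // 2
assert total_iterations == 270409140
```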

I only did two runs so far with your shuffle-patch added, and got runtimes of ~10 minutes (unoptimized) and ~5 minutes (NULL-ify patch, "fix3"). Will do some more thorough benchmarks tomorrow, including your proposed solution.

@w0xlt

w0xlt commented Dec 1, 2025

@theStack Thanks for looking into this.
Sorry if I wasn’t clear earlier. The shuffled benchmark does perform better in the optimized version than in the baseline, but it loses the sub-second scan times.
The new proposed approach (assuming it works as expected) should keep those sub-second times even on the shuffled benchmark.

@theStack
Contributor Author

theStack commented Dec 1, 2025

@w0xlt: I've looked at your proposed scanning implementation w0xlt@2305d73 and can confirm that it fixes the worst-case scanning attack: the run-time of the modified example with shuffled outputs is reduced to <1s. What makes this different approach potentially interesting is that it could also speed up the common case (i.e. no matches), as the number of point additions needed for label scanning seems to be significantly smaller for the small-scale user (where N_LABELS < 2 * N_OUTPUTS for the average transaction; if N_LABELS=1 and we assume the average transaction to scan has N_OUTPUTS=2, your approach is 4x faster w.r.t. needed point additions). To verify that I understood it correctly and to make the idea digestible for other interested reviewers, I've tried to summarize the current PR's scanning approach (as proposed in BIP-352) vs. your approach in the following gist, written in a somewhat hand-wavy way; I hope the pseudo-code is understandable: https://gist.github.com/theStack/25c77747838610931e8bbeb9d76faf78. Please let me know if I got that right; I think the approach leads to the same result, but I'm still trying to wrap my head around it.
// EDIT: I've been told by @setavenger that this approach is also used by Frigate Electrum Server and the BlindBit wallet.
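For intuition, the two scanning directions can be contrasted with a toy integer model (integers stand in for group elements, '+' for point addition; parity and x-only handling are omitted, so this only illustrates why both directions find the same matches, not the actual EC arithmetic):

```python
def scan_bip(P, tx_outputs, label_cache):
    """BIP approach: for each tx output, derive a label candidate
    and look it up in the precomputed label cache."""
    matches = []
    for out in tx_outputs:
        candidate = out - P          # stand-in for "output minus base candidate"
        if candidate in label_cache:
            matches.append((out, candidate))
    return matches

def scan_labelset(P, tx_outputs, labels):
    """Label-set approach: for each configured label, derive an output
    candidate and look it up in the (indexable) set of outputs."""
    outputs = set(tx_outputs)
    matches = []
    for label in labels:
        candidate = P + label        # stand-in for "base candidate plus label point"
        if candidate in outputs:
            matches.append((candidate, label))
    return matches
```

With P = 100, labels = {7, 9} and tx_outputs = [107, 55], both functions report the single match with label 7; the BIP variant does one addition per output, the label-set variant one per label.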

On the other hand, I'm still uncertain whether the additional code complexity and the use of dynamic memory allocation in a secp256k1 module are worth all this, considering that we are talking about an attack that costs an adversary money (at least the fees for the whole block space) and can only slow down one specific SP recipient. It seems a very unusual kind of attack. Changing the scanning API at this point also seems questionable, as it has already gone through several iterations and refinements over the years, and I think the "pass labels directly" approach is slower for users with a higher number of labels involved. Also, the mental load for reviewers is significantly higher if the implementation doesn't follow the algorithm as specified in BIP-352, as they have to convince themselves that the result is equivalent.

I'm hence debating whether one solution could be to simply keep the PR as-is and accept the fact that an adversary can slow down a single entity for a few minutes (for non-standard transactions) or a few seconds (for standard transactions, i.e. tx size <= 100kvB). I don't have a good final answer to this question myself, but want to discuss it at least and come to a decision that both maintainers and users are on board with before taking any further action.

My (in retrospect somewhat naive) hope was that there could be a simple few-line fix for reducing the worst-case scanning attack down to seconds without the need for dynamic memory allocation, but such an easy fix doesn't seem to be in sight.

@w0xlt

w0xlt commented Dec 2, 2025

Hi @theStack thanks a lot for taking the time to review the proposal and write up the comparison.

My view is that we should aim for the most efficient and resilient solution, even if the attack scenario is somewhat unusual. The spec can be updated as needed to reflect improvements. Independently of adversarial cases, reducing scanning time from several minutes to under a second is a major win for user experience.

Regarding complexity: although this approach adds some moving parts, the underlying idea (e.g., using binary search) is conceptually straightforward, so I don’t think it should pose a significant burden by itself.

As for the dynamic memory concern, the allocation doesn’t necessarily have to happen inside the secp256k1 module. We could let the caller provide the required workspace. For example, could we leverage secp256k1_scratch_space here?

@theStack
Contributor Author

theStack commented Dec 2, 2025

@w0xlt: After tinkering a bit with your proposed implementation w0xlt@2305d73, I found that it could be significantly simplified, and it turned out we don't even need dynamic memory allocation:

  • instead of building an explicit index, we can simply sort the user-supplied tx_outputs array of x-only pubkeys in place and perform binary search directly on it (using the existing secp256k1_pubkey_xonly_pubkey_cmp)
  • the preparation of the label set group elements array is not needed either; _pubkey_load is a very cheap operation, so it doesn't hurt to call it repeatedly in each k loop iteration (remember that the common case is not having a match, i.e. k doesn't exceed 0 for most scanned transactions)

Doing this in theStack@1a40e14 still keeps the worst-case scanning benchmark time down at ~450ms on my machine.
Admittedly, there is a tiny bit of run-time overhead introduced by the simplification (repeated x-only pubkey serialization for the common case), but I don't think that matters much, considering that the involved elliptic curve operations are likely orders of magnitude more expensive. With these new insights, I can retract my doubts about the code complexity and dynamic memory allocation of your proposed approach, and it's good to have two potential scanning implementations. 👍

It should be pointed out that the "label set" approach doesn't come for free, though: one major drawback is that for users with a larger label set, the typical scanning scenario (assuming most txs don't have more than just a few outputs) can become slower, as the number of point additions is higher than with the "BIP" approach. So if we follow your approach, it's better for small-scale users and eliminates the worst-case scanning attack, but users with a lot of labels are likely not happy with it and would very much prefer the "BIP" scanning approach. Not really sure how to deal with that and what to prioritize. (Provide two scanning implementations and let the user pick? 😬 )

I'll try anyway to come up with another "modified example" benchmark to better verify the claims about typical scanning scenarios, depending on the number of labels involved and the transaction output count. We have spent a good amount of time on the worst-case scanning scenario now, but it would be wrong to solely optimize for that and lose sight of the typical scanning scenario.

Independently of adversarial cases, reducing scanning time from several minutes to under a second is a major win for user experience.

I don't agree with that statement, as I can't think of a legitimate non-adversarial case that would lead to scanning times of several minutes, in particular as this would at minimum involve creating a large non-standard transaction. When considering user experience, by far the most relevant case is the "no match" scenario (one k loop iteration), as it is the most frequent one, followed by the "single match" scenario (two k loop iterations). Everything above that should already be quite rare in practice.

As for the dynamic memory concern, the allocation doesn’t necessarily have to happen inside the secp256k1 module. We could let the caller provide the required workspace. For example, could we leverage secp256k1_scratch_space here?

Good question. Right now, the scratch space API is not even exposed to the user. Btw, I don't think dynamic memory allocation is a strict no-go in general; it just seems that there are still some open questions around that area, and waiting until they are all tackled would probably delay the PR for an unforeseeable time. As of now, we can easily avoid dynamic memory allocation anyway, in both of the proposed implementations.

@w0xlt

w0xlt commented Dec 2, 2025

Not really sure how to deal with that and what to prioritize. (Provide two scanning implementations and let the user pick? 😬 )

Some (very) rough ideas on the trade-off…

If we can define a general rule that determines when it’s better to use the label-set scan versus the BIP-based scan, we could do something like:

/* Choose algorithm based on this transaction's characteristics */
if (n_labels > 2 * n_tx_outputs) {
    /* BIP approach: iterate over outputs, add, and look up in label callback */
    use_bip_scan();
} else {
    /* Label-set approach: iterate over labels, add, and search in output index */
    use_labelset_scan();
}

The heuristic above is just a rough estimate. The BIP version performs roughly 2N point additions, where N is the number of outputs, while the label-set version performs about L additions, where L is the number of configured labels.

But to build a reliable decision rule, we’d need more benchmark data across a range of (N, L) combinations.
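Under the stated assumptions (~2N point additions for the BIP approach vs. ~L for the label-set approach), the decision rule can be modeled in a few lines; the constants here are rough assumptions, not measurements:

```python
def pick_scan_algorithm(n_tx_outputs: int, n_labels: int) -> str:
    """Pick the scanning approach with the (estimated) lower cost."""
    bip_cost = 2 * n_tx_outputs    # ~2 label candidates (both parities) per output
    labelset_cost = n_labels       # ~1 output candidate per configured label
    return "bip" if bip_cost < labelset_cost else "labelset"
```

This matches the `n_labels > 2 * n_tx_outputs` condition in the snippet above; real benchmark data across (N, L) combinations would be needed to calibrate the constants.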

@theStack
Contributor Author

theStack commented Dec 3, 2025

Here is a first attempt at creating a scanning benchmark for the common case scenario (i.e. no matches and hence only one k iteration), comparing the "BIP" and "Label-set" approaches over a combination of L (number of labels to scan for) and N (number of tx outputs): theStack@fef63fd. Running this modified example outputs the following on my arm64 machine:

$ ./build/bin/silentpayments_example
Silent Payments (BIP-352) scanning benchmarks
[common case scenario, i.e. only one k iteration without match]

Legend: L... number of labels, N... number of transaction outputs

===== BIP approach (calculate label candidates for each output, look them up in labels cache) =====
L= 1: [N=2:  69 us] [N=5:  82 us] [N=10:  89 us] [N=20: 121 us] [N=50: 197 us] [N=100: 343 us] [N=200: 632 us]
L= 2: [N=2:  61 us] [N=5:  80 us] [N=10:  78 us] [N=20: 101 us] [N=50: 177 us] [N=100: 314 us] [N=200: 566 us]
L= 3: [N=2:  54 us] [N=5:  58 us] [N=10:  76 us] [N=20:  95 us] [N=50: 166 us] [N=100: 288 us] [N=200: 535 us]
L= 5: [N=2:  51 us] [N=5:  55 us] [N=10:  67 us] [N=20: 160 us] [N=50: 159 us] [N=100: 271 us] [N=200: 496 us]
L=10: [N=2:  49 us] [N=5:  55 us] [N=10:  66 us] [N=20:  89 us] [N=50: 158 us] [N=100: 276 us] [N=200: 494 us]
L=20: [N=2:  54 us] [N=5:  55 us] [N=10:  68 us] [N=20:  90 us] [N=50: 159 us] [N=100: 271 us] [N=200: 505 us]
L=50: [N=2:  49 us] [N=5:  55 us] [N=10:  71 us] [N=20:  88 us] [N=50: 160 us] [N=100: 274 us] [N=200: 504 us]

===== Label-set approach (calculate output candidate for each label, look it up in outputs) =====
L= 1: [N=2:  51 us] [N=5:  47 us] [N=10:  48 us] [N=20:  51 us] [N=50:  59 us] [N=100:  77 us] [N=200: 116 us]
L= 2: [N=2:  49 us] [N=5:  57 us] [N=10:  50 us] [N=20:  53 us] [N=50:  65 us] [N=100:  81 us] [N=200: 120 us]
L= 3: [N=2:  49 us] [N=5:  51 us] [N=10:  50 us] [N=20:  56 us] [N=50:  63 us] [N=100:  82 us] [N=200: 118 us]
L= 5: [N=2:  58 us] [N=5:  52 us] [N=10:  55 us] [N=20:  56 us] [N=50:  68 us] [N=100:  81 us] [N=200: 119 us]
L=10: [N=2:  63 us] [N=5:  59 us] [N=10:  61 us] [N=20:  62 us] [N=50:  71 us] [N=100:  87 us] [N=200: 126 us]
L=20: [N=2:  70 us] [N=5:  71 us] [N=10:  74 us] [N=20:  90 us] [N=50:  83 us] [N=100: 101 us] [N=200: 139 us]
L=50: [N=2: 104 us] [N=5: 107 us] [N=10: 111 us] [N=20: 113 us] [N=50: 122 us] [N=100: 141 us] [N=200: 192 us]

These results should be taken with a large grain of salt, as the numbers fluctuate a lot (visible in the BIP approach results, where all lines should be roughly equal, as the performance is in theory independent of L; also, the results for the first two values of L are surprisingly worse, maybe due to some unintended caching effects in later runs?). Still, one can at least see the previously suspected general trend: the BIP approach gets worse with increasing N, while the label-set approach gets worse with increasing L. As a very rough first statement, I'd say that assuming users don't use more than a handful (let's say up to 10) of labels and transactions with more outputs become more common in the future, the label-set approach seems superior, and the only reason to keep the BIP approach is to target users that want to scan for a large number (let's say dozens) of labels.

Comments or ideas on how to best reason about this would be much appreciated (also a rough review of the modified benchmark example; it could be that there are bugs). Maybe someone has an idea how to connect this with some actual historic on-chain stats? The list of L and N values to benchmark can be adapted by changing the two arrays n_labels_bench and n_outputs_bench at the top of the example file.

If we can define a general rule that determines when it’s better to use the label-set scan versus the BIP-based scan, we could do something like:

Makes sense, yeah, I thought about something similar. What makes this a bit inconvenient (aside from the additional review and maintenance burden, obviously) is that users would have to both pass in the label set and implement and pass in a callback function, which seems to result in a bloated API. Maybe it's better to focus on one approach and ship another scan function for "giant label set" power users later? I could be wrong, but I don't think using more than three or four labels is a very common use-case.

@w0xlt

w0xlt commented Dec 4, 2025

@theStack Thanks very much for this benchmark. It is very insightful.

From my understanding, your benchmark confirms that the label-set approach performs better overall and scales more smoothly with the transaction size N (number of outputs). For small N they are comparable, but as N grows the label-set approach becomes significantly faster than the BIP approach.

I also evaluated Montgomery batch inversion via secp256k1_ge_set_all_gej_var in the label-set scanning path on top of your commit. This seems to mainly help when there are many labels (large L). For L = 50 the label-set approach becomes noticeably faster still, so it dominates the BIP-style scanning for almost all (L, N) combinations in this benchmark. All runs are using the same “common case” scenario as in your example (one k iteration, no match).

w0xlt@1e8d196

Below are the benchmark results as measured on my machine (no batch inversion):

===== BIP approach (calculate label candidates for each output, look them up in labels cache) =====
L= 1: [N=2: 351 us] [N=5: 200 us] [N=10: 231 us] [N=20: 261 us] [N=50: 289 us] [N=100: 451 us] [N=200: 745 us]
L= 2: [N=2:  81 us] [N=5: 109 us] [N=10:  82 us] [N=20: 153 us] [N=50: 218 us] [N=100: 379 us] [N=200: 631 us]
L= 3: [N=2:  55 us] [N=5:  59 us] [N=10:  75 us] [N=20:  99 us] [N=50: 185 us] [N=100: 299 us] [N=200: 586 us]
L= 5: [N=2:  52 us] [N=5:  58 us] [N=10:  70 us] [N=20:  90 us] [N=50: 171 us] [N=100: 302 us] [N=200: 514 us]
L=10: [N=2:  48 us] [N=5:  53 us] [N=10:  62 us] [N=20:  86 us] [N=50: 153 us] [N=100: 269 us] [N=200: 489 us]
L=20: [N=2:  44 us] [N=5:  52 us] [N=10:  64 us] [N=20:  86 us] [N=50: 147 us] [N=100: 267 us] [N=200: 481 us]
L=50: [N=2:  44 us] [N=5:  47 us] [N=10:  61 us] [N=20:  91 us] [N=50: 153 us] [N=100: 257 us] [N=200: 469 us]

===== Label-set approach (calculate output candidate for each label, look it up in outputs) =====
L= 1: [N=2:  39 us] [N=5:  38 us] [N=10:  37 us] [N=20:  38 us] [N=50:  41 us] [N=100:  48 us] [N=200:  64 us]
L= 2: [N=2:  37 us] [N=5:  36 us] [N=10:  36 us] [N=20:  37 us] [N=50:  41 us] [N=100:  46 us] [N=200:  65 us]
L= 3: [N=2:  40 us] [N=5:  36 us] [N=10:  37 us] [N=20:  39 us] [N=50:  43 us] [N=100:  49 us] [N=200:  75 us]
L= 5: [N=2:  39 us] [N=5:  39 us] [N=10:  41 us] [N=20:  42 us] [N=50:  47 us] [N=100:  51 us] [N=200:  84 us]
L=10: [N=2:  47 us] [N=5:  47 us] [N=10:  45 us] [N=20:  46 us] [N=50:  49 us] [N=100:  55 us] [N=200:  71 us]
L=20: [N=2:  58 us] [N=5:  56 us] [N=10:  58 us] [N=20:  55 us] [N=50:  59 us] [N=100:  66 us] [N=200:  84 us]
L=50: [N=2:  88 us] [N=5:  85 us] [N=10:  93 us] [N=20:  88 us] [N=50:  94 us] [N=100: 100 us] [N=200: 120 us]

The numbers after applying batch inversion:

===== Label-set approach (calculate output candidate for each label, look it up in outputs) =====
L= 1: [N=2:  38 us] [N=5:  37 us] [N=10:  37 us] [N=20:  37 us] [N=50:  41 us] [N=100:  48 us] [N=200:  61 us]
L= 2: [N=2:  35 us] [N=5:  38 us] [N=10:  37 us] [N=20:  44 us] [N=50:  41 us] [N=100:  47 us] [N=200:  62 us]
L= 3: [N=2:  35 us] [N=5:  38 us] [N=10:  39 us] [N=20:  38 us] [N=50:  46 us] [N=100:  46 us] [N=200:  66 us]
L= 5: [N=2:  49 us] [N=5:  36 us] [N=10:  36 us] [N=20:  40 us] [N=50:  42 us] [N=100:  49 us] [N=200:  63 us]
L=10: [N=2:  37 us] [N=5:  37 us] [N=10:  37 us] [N=20:  37 us] [N=50:  41 us] [N=100:  49 us] [N=200:  65 us]
L=20: [N=2:  46 us] [N=5:  47 us] [N=10:  42 us] [N=20:  42 us] [N=50:  48 us] [N=100:  56 us] [N=200:  73 us]
L=50: [N=2:  54 us] [N=5:  46 us] [N=10:  46 us] [N=20:  50 us] [N=50:  52 us] [N=100:  60 us] [N=200:  82 us]

The only code change here is using batch inversion in the label-set path.

@theStack
Contributor Author

theStack commented Dec 4, 2025

Here is a first attempt at creating a scanning benchmark for the common case scenario (i.e. no matches and hence only one k iteration), comparing the "BIP" and "Label-set" approaches over a combination of L (number of labels to scan for) and N (number of tx outputs): theStack@fef63fd

This is now available as an actual benchmark using the framework (i.e. implemented in the module's bench_impl.h), leading to more stable results, especially for the lower-N cases (N<=10), which I'd expect to occur most frequently: theStack@8eced64, see https://gist.github.com/theStack/25c77747838610931e8bbeb9d76faf78?permalink_comment_id=5892266#gistcomment-5892266 for the results on my machine. Both approaches are now also benchmarked for no-labels scanning (i.e. L=0), and for the BIP approach only L=1 is tested, as higher values don't influence the run-time, assuming that the label cache lookup cost is negligible.

@w0xlt:

From my understanding, your benchmark confirms that the label-set approach performs better overall and scales more smoothly with the transaction size N (number of outputs). For small N they are comparable, but as N grows the label-set approach becomes significantly faster than the BIP approach.

I agree. It seems to me that, unless there are really strong reasons to support use-cases with dozens of labels, we should consider switching from the BIP to the label-set scanning approach. I'm not sure though whether larger N values appear frequently enough in blocks nowadays to have a significant influence on the overall scanning time (I'd roughly guess that the vast majority of SP-eligible txs wouldn't exceed 10 outputs, though that should be verified), but that could change in the future.

I also evaluated Montgomery batch inversion via secp256k1_ge_set_all_gej_var in the label-set scanning path on top of your commit. This seems to mainly help when there are many labels (large L). For L = 50 the label-set approach becomes noticeably faster still, so it dominates the BIP-style scanning for almost all (L, N) combinations in this benchmark. All runs are using the same “common case” scenario as in your example (one k iteration, no match).

That's great; I didn't review this code in detail yet, but it seems simple enough to be considered on top if we go with the label-set scanning approach. The benchmarks also look very promising; I will integrate them and run them on my machine tomorrow as well.

Member

@furszy furszy left a comment


A bit late but joining this party full-time now. I just finished the sending part. Will keep reviewing. Cool stuff.


seckey_sum_scalar = secp256k1_scalar_zero;
for (i = 0; i < n_plain_seckeys; i++) {
ret = secp256k1_scalar_set_b32_seckey(&addend, plain_seckeys[i]);
Member


In ffffd7f:

what if plain_seckeys[i] is NULL? (same for taproot_seckeys[i] below)

Contributor Author

@theStack theStack Dec 6, 2025


Good catch, will add an ARG_CHECK to check for non-NULL, here and in other API functions (scanning) where similar checks for "array of pointers" elements are missing. I've also opened #1779 to do the same for API functions in master for consistency.

Comment on lines +302 to +311
/* BIP0352 specifies that k is serialized as a 4 byte (32 bit) value, so we check to make
* sure we are not exceeding the max value for a uint32 before incrementing k.
* In practice, this should never happen as it would be impossible to create a transaction
* with this many outputs.
*/
if (k < UINT32_MAX) {
k++;
} else {
return 0;
}
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ffffd7f:

As k depends on n_recipients, wouldn't it be simpler to ensure that n_recipients < UINT32_MAX early in the function?

ARG_CHECK(n_recipients > 0 && n_recipients < UINT32_MAX);

Contributor Author

@theStack theStack Dec 6, 2025


Bounding the number of recipients seems reasonable to me (even though by protocol they could in theory exceed the 32-bit range, if they are not all going to the same scan pubkey and we would at some point allow ~185 GB sized transactions :p). I wonder if we could even just set the type of n_recipients to uint32_t instead of size_t, so the limit is already enforced at compile-time? IIRC there were some previous discussions between @josibake and @jonasnick about that topic (in take 3 or 2 of this PR), will look those up next week before changing.

Comment on lines +45 to +48
typedef struct secp256k1_silentpayments_recipient {
secp256k1_pubkey scan_pubkey;
secp256k1_pubkey spend_pubkey;
size_t index;
Member


q: Instead of having this weird index field in the public API, wouldn't it be simpler to include the generated output directly inside secp256k1_silentpayments_recipient?
E.g.

struct secp256k1_silentpayments_recipient {
    /* Inputs */
    secp256k1_pubkey in_scan_pubkey;
    secp256k1_pubkey in_spend_pubkey;
    /* Output */
    secp256k1_xonly_pubkey out_generated_output;
}

Contributor Author


I think this change would be slightly simpler for the implementation, but possibly quite confusing for the user. Note that the passed-in _silentpayments_recipient array gets sorted in place (to group by scan pubkey), so we need some way to map the generated outputs back to the original recipients. Doing this by just keeping the original order (that's what the index field is needed for) seems the most obvious (expected?) way. Admittedly, with your proposed change a user would have the necessary information to do the mapping themselves, but it seems odd to e.g. pass in two recipients (r1, r2), get the result back in a different order (r2, r1) and then have to find out what position each recipient has moved to, in order to get the corresponding generated output.
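The role of the index field can be illustrated with a small Python sketch (plain tuples stand in for the recipient struct, and the output derivation is faked; this is not the module's API):

```python
def create_outputs(recipients):
    """Sort recipients by scan pubkey (grouping equal scan keys together,
    as BIP352 requires for the sender), but write each generated output
    back to the caller's original position via the recipient's index."""
    grouped = sorted(recipients, key=lambda r: r[0])  # r = (scan, spend, index)
    outputs = [None] * len(recipients)
    for scan, spend, index in grouped:
        outputs[index] = ("output-for", scan, spend)  # fake derivation
    return outputs
```

Passing (r1, r2) with scan keys ("b", "a") processes r2 first internally, yet outputs[0] still belongs to r1.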

Member


Admittedly, with your proposed change a user would have the necessary information to do the mapping themselves, but it seems odd to e.g. pass in two recipients (r1, r2), get the result back in a different order (r2, r1) and then have to find out what position each recipient has moved to, in order to get the corresponding generated output.

If you have all the information, does the initial ordering matter?

================================================================================

Also, I woke up creative.. another idea to remove the index field could be to store the initial ordering inside the generated_outputs array. We basically have an n_recipients-sized array with elements that are essentially buffers :).
So.. we could do something like:

diff --git a/src/modules/silentpayments/main_impl.h b/src/modules/silentpayments/main_impl.h
--- a/src/modules/silentpayments/main_impl.h	(revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/src/modules/silentpayments/main_impl.h	(date 1765038514517)
@@ -185,13 +185,14 @@
     const unsigned char * const *plain_seckeys,
     size_t n_plain_seckeys
 ) {
-    size_t i, k;
+    size_t i, k, j;
     secp256k1_scalar seckey_sum_scalar, addend, input_hash_scalar;
     secp256k1_ge prevouts_pubkey_sum_ge;
     secp256k1_gej prevouts_pubkey_sum_gej;
     unsigned char shared_secret[33];
     secp256k1_pubkey current_scan_pubkey;
     int ret, sum_is_zero;
+    secp256k1_xonly_pubkey *output = NULL;
 
     /* Sanity check inputs. */
     VERIFY_CHECK(ctx != NULL);
@@ -211,9 +212,6 @@
     } else {
         ARG_CHECK(n_plain_seckeys == 0);
     }
-    for (i = 0; i < n_recipients; i++) {
-        ARG_CHECK(recipients[i]->index == i);
-    }
 
     seckey_sum_scalar = secp256k1_scalar_zero;
     for (i = 0; i < n_plain_seckeys; i++) {
@@ -268,6 +266,13 @@
         return 0;
     }
     secp256k1_scalar_mul(&seckey_sum_scalar, &seckey_sum_scalar, &input_hash_scalar);
+
+    /* Store recipients ordering prior to sorting the array */
+    for (i = 0; i < n_recipients; i++) {
+        /* Use the output struct as a buffer for storing the pointer of the recipient */
+        memcpy(generated_outputs[i]->data, &recipients[i], sizeof(secp256k1_silentpayments_recipient*));
+    }
+
     /* _recipient_sort sorts the array of recipients in place by their scan public keys (lexicographically).
      * This ensures that all recipients with the same scan public key are grouped together, as specified in BIP0352.
      *
@@ -294,7 +299,17 @@
             secp256k1_silentpayments_create_shared_secret(ctx, shared_secret, &pk, &seckey_sum_scalar);
             k = 0;
         }
-        if (!secp256k1_silentpayments_create_output_pubkey(ctx, generated_outputs[recipients[i]->index], shared_secret, &recipients[i]->spend_pubkey, k)) {
+
+        /* Look-up for the correct output to initialize based on the initial ordering */
+        for (j = 0; j < n_recipients; j++) {
+            secp256k1_silentpayments_recipient *ptr = NULL;
+            memcpy(&ptr, generated_outputs[j]->data, sizeof(secp256k1_silentpayments_recipient*));
+            if (ptr == recipients[i]) {
+                output = generated_outputs[j];
+            }
+        }
+
+        if (!secp256k1_silentpayments_create_output_pubkey(ctx, output, shared_secret, &recipients[i]->spend_pubkey, k)) {
             secp256k1_scalar_clear(&seckey_sum_scalar);
             secp256k1_memclear_explicit(&shared_secret, sizeof(shared_secret));
             return 0;

===================================================================
diff --git a/include/secp256k1_silentpayments.h b/include/secp256k1_silentpayments.h
--- a/include/secp256k1_silentpayments.h	(revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/include/secp256k1_silentpayments.h	(date 1765038119367)
@@ -45,7 +45,6 @@
 typedef struct secp256k1_silentpayments_recipient {
     secp256k1_pubkey scan_pubkey;
     secp256k1_pubkey spend_pubkey;
-    size_t index;
 } secp256k1_silentpayments_recipient;
 
 /** Create Silent Payments outputs for recipient(s).


===================================================================
diff --git a/src/modules/silentpayments/tests_impl.h b/src/modules/silentpayments/tests_impl.h
--- a/src/modules/silentpayments/tests_impl.h	(revision ffffd7ff98368b29759cd3d9933896fb9fa69b1f)
+++ b/src/modules/silentpayments/tests_impl.h	(date 1765038518151)
@@ -98,7 +98,6 @@
     for (i = 0; i < 3; i++) {
         CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].scan_pubkey, (*sp_addresses[i])[0], 33));
         CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].spend_pubkey,(*sp_addresses[i])[1], 33));
-        recipients[i].index = i;
         recipient_ptrs[i] = &recipients[i];
         generated_output_ptrs[i] = &generated_outputs[i];
     }
@@ -174,7 +173,6 @@
         CHECK(secp256k1_ec_pubkey_parse(CTX, &r[i].scan_pubkey, (*sp_addresses[i])[0], 33));
         CHECK(secp256k1_ec_pubkey_parse(CTX, &r[i].spend_pubkey,(*sp_addresses[i])[1], 33));
-        /* Set the index value incorrectly */
-        r[i].index = 0;
         rp[i] = &r[i];
         op[i] = &o[i];
     }
@@ -183,13 +181,6 @@
     t[0] = &taproot;
     p[0] = ALICE_SECKEY;
 
-    /* Fails if the index is set incorrectly */
-    CHECK_ILLEGAL(CTX, secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1));
-
-    /* Set the index correctly for the next tests */
-    for (i = 0; i < 2; i++) {
-        r[i].index = i;
-    }
     CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1));
 
     /* Check that null arguments are handled */
@@ -257,7 +248,6 @@
         CHECK(secp256k1_ec_pubkey_negate(CTX, &neg_spend_pubkey));
         r[0].spend_pubkey = neg_spend_pubkey;
         for (i = 0; i < 2; i++) {
-            r[i].index = i;
             rp[i] = &r[i];
         }
         CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, op, rp, 2, SMALLEST_OUTPOINT, NULL, 0, p, 1) == 0);

That being said, this improves the public API at the cost of making the internal code a bit more complex. So I don't feel strongly about it; I just had some fun thinking about different ways to remove the ugly index field. No need to take it.

Contributor Author
If you have all the information, does the initial ordering matter?

In theory it doesn't; in practice I think some users would be at least surprised or annoyed by getting the generated outputs back in an order that is different from the passed-in recipients (some might even lose money by confusing the amounts of, e.g., the actual recipient and the change output).

Also, I woke up creative.. another idea to remove the index field could be to store the initial ordering inside the generated_outputs array. We basically have an n_recipients-sized array with elements that are essentially buffers :).

That's a neat and indeed very creative idea 😃. (Ab)using a pubkey object's underlying memory to temporarily store a pointer value feels very hacky though, so I'm not sure we want to introduce this kind of trick in general. On the other hand, simplifying the API would indeed be a win, so I'm not fully opposed either. Curious what others think about this.

Comment on lines +266 to +270
if (!secp256k1_silentpayments_calculate_input_hash_scalar(&input_hash_scalar, outpoint_smallest36, &prevouts_pubkey_sum_ge)) {
secp256k1_scalar_clear(&seckey_sum_scalar);
return 0;
}
secp256k1_scalar_mul(&seckey_sum_scalar, &seckey_sum_scalar, &input_hash_scalar);
Member
@furszy furszy Dec 5, 2025
In ffffd7f:

This is not realistically possible, but we should also check that the input hash scalar is not the inverse of the aggregated secret key (i.e., that the result is not 1).

Contributor Author

I'm trying to wrap my head around what the exact implications would be if that indeed happened. If $a * inputhash = 1$, the shared secret would be equal to the recipient scan public key $B_{scan}$. That sounds problematic; I suppose it's a loss of privacy, as the output tweaks are revealed, but the funds are still safe?
(Wouldn't any other small-ish value be just as problematic, since guessing $2 * B_{scan}$, $3 * B_{scan}$ etc. is trivial as well?)
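For reference, the degenerate case being discussed follows directly from the BIP-352 shared-secret derivation (sketched in the BIP's notation, with $n$ the curve order):

```latex
\begin{align*}
\mathrm{ecdh\_shared\_secret} &= (\mathit{input\_hash} \cdot a_{\mathrm{sum}}) \cdot B_{\mathrm{scan}} \\
\mathit{input\_hash} \cdot a_{\mathrm{sum}} \equiv 1 \pmod{n}
  \;&\Longrightarrow\; \mathrm{ecdh\_shared\_secret} = B_{\mathrm{scan}}
\end{align*}
```

Since $B_{scan}$ is public, the output tweaks would then be computable by anyone, consistent with reading this as a privacy loss rather than a theft risk (spending still requires the secret $b_{spend}$).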

if (!secp256k1_eckey_pubkey_tweak_add(&output_ge, &output_tweak_scalar)) {
secp256k1_scalar_clear(&output_tweak_scalar);
return 0;
};
Member
In ffffd7f:

Extra semicolon here.

Comment on lines +87 to +157
static void test_recipient_sort_helper(unsigned char (*sp_addresses[3])[2][33], unsigned char (*sp_outputs[3])[32]) {
    unsigned char const *seckey_ptrs[1];
    secp256k1_silentpayments_recipient recipients[3];
    const secp256k1_silentpayments_recipient *recipient_ptrs[3];
    secp256k1_xonly_pubkey generated_outputs[3];
    secp256k1_xonly_pubkey *generated_output_ptrs[3];
    unsigned char xonly_ser[32];
    size_t i;
    int ret;

    seckey_ptrs[0] = ALICE_SECKEY;
    for (i = 0; i < 3; i++) {
        CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].scan_pubkey, (*sp_addresses[i])[0], 33));
        CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[i].spend_pubkey, (*sp_addresses[i])[1], 33));
        recipients[i].index = i;
        recipient_ptrs[i] = &recipients[i];
        generated_output_ptrs[i] = &generated_outputs[i];
    }
    ret = secp256k1_silentpayments_sender_create_outputs(CTX,
        generated_output_ptrs,
        recipient_ptrs, 3,
        SMALLEST_OUTPOINT,
        NULL, 0,
        seckey_ptrs, 1
    );
    CHECK(ret == 1);
    for (i = 0; i < 3; i++) {
        secp256k1_xonly_pubkey_serialize(CTX, xonly_ser, &generated_outputs[i]);
        CHECK(secp256k1_memcmp_var(xonly_ser, (*sp_outputs[i]), 32) == 0);
    }
}

static void test_recipient_sort(void) {
    unsigned char (*sp_addresses[3])[2][33];
    unsigned char (*sp_outputs[3])[32];

    /* With a fixed set of addresses and a fixed set of inputs,
     * test that we always get the same outputs, regardless of the ordering
     * of the recipients
     */
    sp_addresses[0] = &CAROL_ADDRESS;
    sp_addresses[1] = &BOB_ADDRESS;
    sp_addresses[2] = &CAROL_ADDRESS;

    sp_outputs[0] = &CAROL_OUTPUT_ONE;
    sp_outputs[1] = &BOB_OUTPUT;
    sp_outputs[2] = &CAROL_OUTPUT_TWO;
    test_recipient_sort_helper(sp_addresses, sp_outputs);

    sp_addresses[0] = &CAROL_ADDRESS;
    sp_addresses[1] = &CAROL_ADDRESS;
    sp_addresses[2] = &BOB_ADDRESS;

    sp_outputs[0] = &CAROL_OUTPUT_ONE;
    sp_outputs[1] = &CAROL_OUTPUT_TWO;
    sp_outputs[2] = &BOB_OUTPUT;
    test_recipient_sort_helper(sp_addresses, sp_outputs);

    sp_addresses[0] = &BOB_ADDRESS;
    sp_addresses[1] = &CAROL_ADDRESS;
    sp_addresses[2] = &CAROL_ADDRESS;

    /* Note: in this case, the second output for Carol comes before the first.
     * This is because heapsort is an unstable sorting algorithm, i.e., the ordering
     * of identical elements is not guaranteed to be preserved
     */
    sp_outputs[0] = &BOB_OUTPUT;
    sp_outputs[1] = &CAROL_OUTPUT_TWO;
    sp_outputs[2] = &CAROL_OUTPUT_ONE;
    test_recipient_sort_helper(sp_addresses, sp_outputs);
}
Member
In ffffd7f:

The heapsort comment was a bit misleading to me. It only matters because we’re hardcoding the expected outputs, but it's not something we actually care about.

The goal of the test is to make sure that shuffling the inputted recipients never changes which outputs are produced, and that each expected output appears exactly once.

So we shouldn't need to rely on any final-ordering assumptions at all; they hinge on heapsort's stability, and heapsort (per the comment) is unstable. Take the following if you like it:

static void shuffle(unsigned char (*sp_addresses[])[2][33], size_t size) {
    size_t i, j;
    for (i = size - 1; i > 0; i--) {
        unsigned char (*tmp)[2][33] = sp_addresses[i];
        j = testrand_bits(8) % (i + 1);
        sp_addresses[i] = sp_addresses[j];
        sp_addresses[j] = tmp;
    }
}

/* Ensure that shuffling the inputted recipients never changes
 * which outputs are produced, and that each expected output
 * appears exactly once */
static void test_recipient_sort(void) {
    int i, j;
    unsigned char (*sp_addresses[3])[2][33] = { &CAROL_ADDRESS, &CAROL_ADDRESS, &BOB_ADDRESS };
    unsigned char const *seckey_ptrs[1] = { ALICE_SECKEY };
    unsigned char xonly_ser[32];

    secp256k1_silentpayments_recipient recipients[3];
    const secp256k1_silentpayments_recipient *recipient_ptrs[3];
    secp256k1_xonly_pubkey generated_outputs[3];
    secp256k1_xonly_pubkey *generated_output_ptrs[3];

    for (i = 0; i < 6; i++) {
        int pos_bob = -1, pos_carol_one = -1, pos_carol_two = -1;

        /* Randomize array */
        shuffle(sp_addresses, 3);

        for (j = 0; j < 3; j++) {
            CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[j].scan_pubkey, (*sp_addresses[j])[0], 33));
            CHECK(secp256k1_ec_pubkey_parse(CTX, &recipients[j].spend_pubkey,(*sp_addresses[j])[1], 33));
            recipients[j].index = j;
            recipient_ptrs[j] = &recipients[j];
            generated_output_ptrs[j] = &generated_outputs[j];
        }
        CHECK(secp256k1_silentpayments_sender_create_outputs(CTX, generated_output_ptrs,
                                                             recipient_ptrs, 3,
                                                             SMALLEST_OUTPOINT,
                                                             NULL, 0,
                                                             seckey_ptrs, 1) == 1);

        /* Verify output correctness and they all appear once */
        for (j = 0; j < 3; j++) {
            int* pos = NULL;
            CHECK(secp256k1_xonly_pubkey_serialize(CTX, xonly_ser, &generated_outputs[j]));
            if (secp256k1_memcmp_var(xonly_ser, BOB_OUTPUT, 32) == 0) {
                pos = &pos_bob;
            } else if (secp256k1_memcmp_var(xonly_ser, CAROL_OUTPUT_ONE, 32) == 0) {
                pos = &pos_carol_one;
            } else if (secp256k1_memcmp_var(xonly_ser, CAROL_OUTPUT_TWO, 32) == 0) {
                pos = &pos_carol_two;
            } else {
                TEST_FAILURE("Error: unknown generated output");
            }
            CHECK(*pos == -1); /* enforce uniqueness */
            *pos = j;
        }
    }
}

Contributor Author
@theStack theStack Dec 7, 2025

Thanks for taking a closer look there. I agree this could be improved.

The goal of the test is to make sure that shuffling the inputted recipients never changes which outputs are produced, and that each expected output appears exactly once.

So we don’t need to rely on any final ordering assumptions at all; they depend on heapsort’s stability, which (per comment) is unstable.

Note that order of the generated outputs does indeed matter and should be tested. The sending API as-is guarantees that for a passed in recipient at position i (recipients[i]), the generated output at the same position (generated_outputs[i]) can be found (and hence be spent later) by that exact recipient. The only thing not guaranteed due to unstable heap sort is the order in which the increasing k values are used for generating those outputs, for recipients that share the same scan public key.

For example, let's say we have three recipients, leading to three outputs:
r0 = (scan_pk, spend_pk_A) -> tx_out0
r1 = (scan_pk, spend_pk_B) -> tx_out1
r2 = (scan_pk, spend_pk_C) -> tx_out2

These three recipients are all in one group, sharing the same ecdh_shared_secret (see BIP-352). The output tweak $t_k$ used for creating an output depends on that shared secret and a counter k that is increased for each spend public key in the group, starting with k=0. Due to heap sort being unstable, the order in which these spend public keys are picked from the group to create outputs is not guaranteed, i.e. it might be that spend_pk_C is used first to create the output with k=0 (-> assigned to tx_out2), then spend_pk_A with k=1 (-> assigned to tx_out0) and lastly spend_pk_B with k=2 (-> assigned to tx_out1). For a recipient, the k value that was used for output generation doesn't matter; they will still find their outputs as long as they consider all tx outputs and follow the scanning protocol correctly. Note that assigning the generated outputs to the intended position (matching the recipient position) is done using the index field in the recipient data structure (see also the previous related comment #1765 (comment)).

So in the quoted test code,
CAROL_OUTPUT_ONE means "output to carol, generated with k=0" and
CAROL_OUTPUT_TWO means "output to carol, generated with k=1" accordingly.
If sp_addresses[i] is assigned to BOB_ADDRESS, sp_outputs[i] must be BOB_OUTPUT (this case is unambiguous, as Bob's scan pubkey appears only once in the recipient list).
If sp_addresses[i] is assigned to CAROL_ADDRESS, sp_outputs[i] must be either CAROL_OUTPUT_ONE or CAROL_OUTPUT_TWO (that scan pubkey appears twice in the recipient list, so the order within those is ambiguous).

I will give this some more thoughts on how to better express this in the test, suggestions of course welcome.

Member
I did this in my head based on the other comment, the one where I asked "If you have all the information, does the initial ordering matter?" (#1765 (comment)). I just read your response there. Thanks.

We can easily adapt the generalized test to verify that the initial recipients' ordering matches the final output order. We just need to cache their initial positions during creation (post-shuffle), then check that Bob stays in the same spot and that the two Carols appear exactly once, working around the heapsort instability. That is simple to do.

Overall, this latest test covers strictly more cases than the previous one. And with this approach, we can also easily confirm we're not biased by the fixed number of outputs we currently have (right now we only verify arrays of length 3); e.g. it would be nice to test against arrays of length two as well. Also, very importantly, it would be good to test that k is increased correctly for a second output from Bob when there are two outputs for Carol (if k is only ever increased and never reset, or something else unexpected happens, we have a major issue).
I'm not expecting this ^^ to fail, it is just test coverage we should have.

Comment on lines +148 to +150
SECP256K1_API SECP256K1_WARN_UNUSED_RESULT int secp256k1_silentpayments_recipient_create_label(
const secp256k1_context *ctx,
secp256k1_pubkey *label,
Member
I'm unsure about the label use case so far, but leaving that topic aside (I'll keep thinking about it):

I'm wondering if we could use a typedef for the label here instead of relying on the public key structure.
At least for me, public keys are meant for sending coins to, and we don’t want anyone sending coins to the group element label = hash(scan_key || m) * G. Clarifying the intent with a simple typedef might save us a few headaches.
